scene parsing and motion dynamic
Reviews: Predicting Scene Parsing and Motion Dynamics in the Future
The paper proposes a deep-learning-based approach to joint prediction of future optical flow and semantic segmentation in videos. The authors evaluate the approach in a driving scenario and show that the two components - flow prediction and semantic segmentation prediction - benefit from each other. The paper is related to works of Jin et al. and Neverova et al. However, as far as I understand, both of these have not been officially published at the time of submission (and the work of Neverova et al. Detailed comment: Pros: 1) The idea seems sound: predicting segmentation and optical flow are both important tasks, and they should be mutually beneficial.
Predicting Scene Parsing and Motion Dynamics in the Future
Jin, Xiaojie, Xiao, Huaxin, Shen, Xiaohui, Yang, Jimei, Lin, Zhe, Chen, Yunpeng, Jie, Zequn, Feng, Jiashi, Yan, Shuicheng
It is important for intelligent systems, e.g. Predicting the future scene parsing and motion dynamics helps the agents better understand the visual environment better as the former provides dense semantic segmentations, i.e. what objects will be present and where they will appear, while the latter provides dense motion information, i.e. how the objects move in the future. In this paper, we propose a novel model to predict the scene parsing and motion dynamics in unobserved future video frames simultaneously. Using history information (preceding frames and corresponding scene parsing results) as input, our model is able to predict the scene parsing and motion for arbitrary time steps ahead. More importantly, our model is superior compared to other methods that predict parsing and motion separately, as the complementary relationship between the two tasks are fully utilized in our model through joint learning.